HMM based Automatic Speech Recognition Analysis
Abstract
The main aim of this project, "HMM Based Automatic Speech Recognition Analysis", is to build a clear and accurate automatic speech recognition system using hidden Markov models (HMMs) that gives accurate results across a number of frequency ranges of the human voice. The database consists of 12 different words recorded by a number of different speakers, both male and female (mostly female). The recognition results are then compared across several feature extraction methods rather than a single one, because earlier research work on this thesis topic used only one feature extraction method and recognized seven small vocal sounds using an HMM. The speech recognition system is divided into two major blocks. The first block covers recording the database and extracting features from all recorded signals. We use Mel frequency cepstral coefficients (MFCC), linear frequency cepstral coefficients and fundamental frequency as features, instead of the single extraction method used earlier. To obtain the MFCCs, a signal passes through the following stages: pre-emphasis, framing, windowing, fast Fourier transform, filter bank and discrete cosine transform; the linear frequency cepstral coefficients are computed without the Mel frequency scale. The second block describes the HMM used for modeling and recognizing the spoken words. All training samples are clustered using the K-means algorithm. The modeling parameters are those of a Gaussian mixture: mean, variance and weight. The Baum-Welch algorithm is used to train on the samples and re-estimate the parameters. Finally, the Viterbi algorithm finds the state sequence that best matches the observation sequence of the vocal sound to be recognized.
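The MFCC stages named in the abstract (pre-emphasis, framing, windowing, FFT, filter bank, DCT) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' MATLAB code. The function names, the Hamming window, the log compression before the DCT and all parameter values (16 kHz sampling rate, 25 ms frames, 10 ms step, 26 filters, 13 coefficients, pre-emphasis factor 0.97) are assumptions chosen for the example; the abstract does not specify them.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=0.025, frame_step=0.010,
         n_fft=512, n_filters=26, n_ceps=13, pre_emph=0.97):
    """Return an (n_frames, n_ceps) array of cepstral coefficients."""
    signal = np.asarray(signal, dtype=float)
    # 1. Pre-emphasis: boost high frequencies with a first-order filter.
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])
    # 2. Framing: split into overlapping frames (zero-pad very short input).
    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    if len(emphasized) < flen:
        emphasized = np.pad(emphasized, (0, flen - len(emphasized)))
    n_frames = 1 + (len(emphasized) - flen) // fstep
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # 3. Window function (Hamming is an assumption of this sketch).
    frames = frames * np.hamming(flen)
    # 4. FFT: power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 5. Mel filter bank energies, log-compressed (floored to avoid log(0)).
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, 1e-10))
    # 6. DCT-II (unnormalized) of the log energies; keep the first n_ceps.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T
```

Calling `mfcc(samples)` on a one-dimensional array of audio samples yields one row of coefficients per frame. Replacing the mel-spaced filter centers in `mel_filterbank` with linearly spaced ones would give a linear frequency cepstral variant of the same pipeline.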
All simulations were carried out in MATLAB.
Keywords: MATLAB, Rule Viewer, Windows 7 operating system, HMM, MFCC, window techniques, feature extraction methods.
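The Viterbi decoding step mentioned in the abstract can be illustrated with a short log-domain sketch. It is written in Python with NumPy for illustration and is not the authors' MATLAB implementation; the function name `viterbi` and its argument layout are assumptions of this example. In a system like the one described, the per-frame emission log-likelihoods would presumably come from the Gaussian mixtures of each word model, and the recognized word would be the model with the highest score.

```python
import numpy as np

def viterbi(log_start, log_trans, log_emit):
    """Most likely state path of an HMM, computed in the log domain.

    log_start: (N,)   log initial state probabilities
    log_trans: (N, N) log transition probabilities, row i -> column j
    log_emit:  (T, N) log-likelihood of each observation under each state
    Returns (path, log_score) where path has length T.
    """
    T, N = log_emit.shape
    delta = np.empty((T, N))           # best log score ending in each state
    psi = np.zeros((T, N), dtype=int)  # back-pointers to the previous state
    delta[0] = log_start + log_emit[0]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, then moving to j
        scores = delta[t - 1][:, None] + log_trans
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(N)] + log_emit[t]
    # Backtrack from the best final state.
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, float(delta[-1, path[-1]])
```

Working with log probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds scores instead of multiplying probabilities.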
Similar resources
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate data entry into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling
The performance of automatic speech recognition (ASR) systems is adversely affected by the variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variations in the speakers is the differences between their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...
Decision Tree Clustering for KL-HMM
Recent Automatic Speech Recognition (ASR) studies have shown that Kullback-Leibler divergence based hidden Markov models (KL-HMMs) are very powerful when only small amounts of training data are available. However, since the KL-HMMs use a cost function that is based on the Kullback-Leibler divergence (instead of maximum likelihood), standard ASR algorithms such as the commonly used decision tree cl...
Convolutional neural network with adaptive windows for speech recognition
Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable performance gap between their accuracy and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...